🔍 Data briefing

The data contains total 110 valid observations(remove the rows where volume is equal to zero all the time) within 22 specific items spreading in 5 regions(ADE BRI MEL PER SYD).

Here is the glimpse of the converted data in long format :

## tibble [4,730 × 15] (S3: tbl_df/tbl/data.frame)
##  $ Item & Region    : chr [1:4730] "1234ADE" "1234ADE" "1234ADE" "1234ADE" ...
##  $ Region           : chr [1:4730] "ADE" "ADE" "ADE" "ADE" ...
##  $ Item Code        : Factor w/ 22 levels "1234","1235",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Item Description : chr [1:4730] "Item 1234" "Item 1234" "Item 1234" "Item 1234" ...
##  $ Group            : chr [1:4730] "10PB" "10PB" "10PB" "10PB" ...
##  $ Supplier         : chr [1:4730] "1YA" "1YA" "1YA" "1YA" ...
##  $ Pack Qty         : num [1:4730] 60 60 60 60 60 60 60 60 60 60 ...
##  $ Pack cubic metres: num [1:4730] 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 ...
##  $ E.O.Q.           : num [1:4730] 1200 1200 1200 1200 1200 1200 1200 1200 1200 1200 ...
##  $ Cost             : num [1:4730] 5.97 5.97 5.97 5.97 5.97 5.97 5.97 5.97 5.97 5.97 ...
##  $ Month            : Factor w/ 12 levels "Jan","Feb","Mar",..: 10 11 12 1 2 3 4 5 6 7 ...
##  $ Year             : chr [1:4730] "2020" "2020" "2020" "2021" ...
##  $ Volume           : num [1:4730] 960 660 840 1260 120 900 840 1020 480 780 ...
##  $ Yearmonth        : mth [1:4730] 2020 Oct, 2020 Nov, 2020 Dec, 2021 Jan, 2021 Feb, 2021 Ma...
##  $ Category         : Factor w/ 3 levels "High","Middle",..: 1 1 1 1 1 1 1 1 1 1 ...

💡 Pattern learning

📈 Historical pattern

If displaying all data at once on the canvas will result in visual clutter and make it difficult to analyze the data intuitively, therefore, in the drawing part, only 10 sampled products will be displayed. However, in the backend calculations, all items will still be considered.the principle of the sample selection will based on the overall average volume of goods,by choosing the top three, middle four, and bottom four items, below is the item code I select for example:

the plot 1 of overall movement capture the square-root transformation of sampled product’s volume over time:

Figure 1: he overall movement of AMES sample products

  • For the hot-selling product(1234,1238 and 1271),it is noticeable that they are more susceptible to non-seasonal influences , with more volatile fluctuation and regional differentiation in time series.– Product Z: erratic sku.

  • For niche product(1258,1259 and 1260) are tend to have a flat line sticked to the zero baseline across different periods and regions, their volumes are relatively stable. – Product X: stable sku.

📊 Seasonal pattern

the plot 2 showcase overall movement in subseries breakdown.

Figure 2: The overall movement in monthly breakdown

  • Except for hot-selling product, other products don’t have noticeable ebbs and flows.

  • As a new launch product, 1271 is still under expansion stage, which is prominent on Jul and Aug.

  • 1234 and 1271 are parallel in most of region & month combo, with similar endogenous/exogenous impact factors.

  • Region disparity: Sydney and Melbourne has the most conspicuous variability in year-to-year monthly change, vice versa in Adelaide.

💡 Demand forecasting

✏️ STL decomposition

STL decomposition, or Seasonal and Trend decomposition using Loess, is a statistical method used for decomposing a time series into three components: seasonality, trend, and remainder (residual).

Here’s a breakdown of each component:

  • Seasonal Component : This represents the repeating pattern or seasonality in the data. It captures variations that occur at regular intervals, such as daily, weekly, monthly, or yearly patterns.

  • Trend Component : This captures the long-term movement or direction of the time series data. It represents the underlying trend or tendency of the data over time, ignoring the seasonal fluctuations.

  • Remainder (Residual) Component : This component captures the random fluctuations or noise that cannot be attributed to the seasonality or trend. It represents the variability in the data that remains after removing the seasonal and trend components.

The STL decomposition plot 3 reports convert the volume into trend, cyclical pattern(adjusted/unadjusted) and remainder respectively.

Figure 3: The STL decomposition of AMES sample products

-Trend: Majority of products maintains dynamic equilibrium and recently gradually taper off.

-Seasonality: Hot-selling has a quarter cyclical pattern following bimodal or multimodal Distributions :The peak normally took place on winter season while summer season is normally trough season.

-Remainder: High uncertainty on Sydney and Melbourne, vice versa in Perth and Adelaide.

As for the middle-class, since the overall plotting zoom in on the result on those top-selling products, we barely cannot observe valuable information on confusion product, so I extract their performance solely on plot 4:

Figure 4: The STL decomposition of AMES confusion products

We can notice their cyclical pattern is varied and incorporated for exceptional cases:

  • majority is multivariate peaks but 1272 in Perth only get single peak on Aug.

  • majority in Melbourne and Sydney region still maintain certain seasonality except 1247

Conclusion on model selection

Hot-selling product or region in Sydney and Melbourne: ETS/ARIMA

Niche product or region in Adelaide: 12 month rolling average

Middle(confusion) product: ETS/ARIMA comparison

📑 model forecasting

ETS/ARIMA

The ETS model, standing for Error, Trend, and Seasonality, is a type of time series forecasting model commonly used in statistics and econometrics. It decomposes a time series into three components:

  • Error (E) : This component represents the random variation or noise in the data that cannot be explained by the trend or seasonality. It is often assumed to follow a normal distribution with mean zero.

  • Trend (T) : This component captures the long-term movement or direction of the time series data. It represents the underlying trend or tendency of the data over time. The trend component can be modeled using various methods, such as exponential smoothing or linear regression.

  • Seasonality (S): This component represents the repeating patterns or cycles in the data that occur at fixed intervals, such as daily, weekly, monthly, or yearly patterns. Seasonality can be modeled using methods like seasonal decomposition or seasonal exponential smoothing.

The ARIMA model, which stands for AutoRegressive Integrated Moving Average, is another popular time series forecasting model commonly used in statistics and econometrics. It is a combination of three components:

  • AutoRegressive (p) Component: This component models the relationship between an observation and a certain number of lagged observations (i.e., its own past values). It captures the linear dependence between the current value and its past values.

  • Integrated (d) Component: This component represents the differencing of the original time series data to make it stationary. Stationarity is a key assumption in many time series models, including ARIMA. The integrated component accounts for the number of differencing operations needed to achieve stationarity.

  • Moving Average (q) Component: This component models the relationship between an observation and a residual error from a moving average model applied to lagged observations of the time series. It captures the dependence between the current value and the past forecast errors.

The formula can be represented as:

\[ Y_t =c+ϕ_1Y_{t−1}+ϕ_2Y_{t−2}+...+ϕ_pY_{t−p} +ε +θ_1ε_{t-1}+θ_2ε_{t-2} +...+θ_pε_{t-p}\] Where

1). \(ϕ_1,ϕ_2...ϕ_p\) are the autoregressive coefficients.

2). \(θ__1,θ__2...θ__p\) are the moving average coefficients.

3). \(ε_t\) is the error term at time 𝑡

The plot 5 exhibits the performance of ETS& ARIMA model prediction in region breakdown:

Figure 5: The ETS&ARIMA model predicton of AMES sample products

In this sample, ETS model is more proficient in capture dynamic change in upcoming volume compare with ARIMA model(most of them are horizontal lines). The potential reason is The parameters of ARIMA model is automatically selected by computer and it is not that robust since the trend is un-stabilized.

Below is the plot 5 integrates historical records and predictive value in ETS model for hot selling product in the same canvas, so we can notice excellent fits in the hot selling item’s data properly.

Figure 6: The historical trend + ETS model prediction in hot selling product

As for confusion product, the performance also get the constraint by residual that cannot explained by the model. E.g. 1246 in Brisbane and Sydney.

Figure 7: The historical trend + ETS model prediction in confusion product

The table reports the model result for all relevant data by ETS/ARIMA model forecasting, including their 90% and 95% safety stock,sigma and safety stock adjustment(assuming is shipment leadtime is 45 days).

Rolling average

The plot 8 exhibits the result of 12 month rolling model prediction in region breakdown as a stationary line with minor:

Figure 8: The 12 month rolling average model predicton of AMES sample products

When we concat with the historical data, it proves our assumptions:

  1. In Adelaide region and niche product, the prediction is aligned with historical data keep at certain level.

  2. However, for confusion product in other regions, it gives a gradual ascend slope and fails to acquire the the peak and trough. So in this particular sample, except for region in Adelaide, we are keen to apply ETS method to confusion products.

Figure 9: The historical trend + 12 month rolling average prediction

The table reports the model result for all relevant data by ETS/ARIMA model forecasting, including their 90% and 95% safety stock,sigma and safety stock adjustment(assuming is shipment leadtime is 45 days).

💡 Decision model

🚀 Update

To cascader the result to the Decision model, here is the tailor update for this decision model solution:

  • Take the result of 2024 May safety stock

  • The safety stock will compare with Economic Order Quantity (EOQ) of the sku

  • CBM = Safety stock*(Pallet CBM/Pallet quantity)

  • No product air volume for calculation

  • For sku not covered in sample list, temporary use ETS prediction

🧭 Backend logic

Node 1: Full container analysis

  • (½)RDC rule:

a). Region is “MEL” and cbm>0 : “NDC check” – temporary stage

b). Except for “MEL” region:

    1.single cbm >= container threshold: "RDC: FCL for single sku"
    
    2. single cbm <= container threshold & remaining space is within [0,5%]:"RDC: FCL for single sku by adjusting the              volume quantity"
    
    3.single cbm <= container threshold  but can do consol cbm: "RDC: FCL for consolidated sku"
    

c). Consol CBM <= container threshold ~“NDC check” -Preliminary stage, go for NDC rule validation

d). Not applicable value: NA

e). CBM<=0, “No or wrong predictive orders detected”

  • NDC rule:

a). single cbm >= container threshold: “NDC: FCL for single sku”

b). single cbm <= container threshold & remaining space is within [0,5%]:“NDC: FCL for single sku by adjusting the volume quantity”

c). single cbm <= container threshold but can do consol cbm: “NDC: FCL for consolidated sku”

d). All conditions are unmatched : “Other strategy to wait for FCL consolidation”– Rare case in actual practice

⌛ Sample testing

Here is the frequency report prediction result and the detail table:

## 
## NDC: FCL for consolidated sku       NDC: FCL for single sku 
##                            65                             3 
## RDC: FCL for consolidated sku       RDC: FCL for single sku 
##                            28                            12 
## RDC: revise for cost analysis 
##                             2